Pengfei Wan

Geometry-Aware Implicit Memory for Video World Models

Jun 01, 2026
Via arXiv

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

Jun 01, 2026
Via arXiv

DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory

May 29, 2026
Via arXiv

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV

May 25, 2026
Via arXiv

LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning

May 21, 2026
Via arXiv

SRC-Flow: Compact Semantic Representations Enable Normalizing Flows for Image Generation

May 18, 2026
Via arXiv

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos

May 18, 2026
Via arXiv

FU-MPC: Frontier- and Uncertainty-Aware Model Predictive Control for Efficient and Accurate UAV Exploration with Motorized LiDAR

May 14, 2026
Via arXiv

UniCustom: Unified Visual Conditioning for Multi-Reference Image Generation

May 13, 2026
Via arXiv

Omni-o3: Deep Nested Omnimodal Deduction for Deliberative Audio-Visual Reasoning

Apr 27, 2026
Via arXiv